196 ◾ Bioinformatics
perform differential analysis. Since our primary goal is to study the difference between the
normal and tumor samples, we can construct a contrast using the “makeContrasts” function
and then conduct the statistical test using the “glmQLFTest” function as follows:
my.contrasts <- makeContrasts(conditiontumo-conditionnorm, levels=design)
fitq <- glmQLFit(yNorm, design)
qlfq <- glmQLFTest(fitq, contrast=my.contrasts)
topTags(qlfq, n=10, adjust.method="BH", sort.by="PValue", p.value=0.05)
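To make the contrast concrete, the following base R sketch shows what a "tumor minus normal" contrast amounts to. It assumes the design matrix was built without an intercept from a factor with levels "norm" and "tumo" (which would produce the column names used above). The sample labels and the per-group log2 expression values are hypothetical and serve only as an illustration.

```r
# Hypothetical sample labels: three normal and three tumor samples (assumed).
condition <- factor(c("norm", "norm", "norm", "tumo", "tumo", "tumo"))

# A no-intercept design gives one column per group:
# "conditionnorm" and "conditiontumo".
design <- model.matrix(~ 0 + condition)
print(colnames(design))

# The contrast conditiontumo - conditionnorm written out by hand:
# weight -1 on the normal group and +1 on the tumor group.
contrast.vec <- c(conditionnorm = -1, conditiontumo = 1)

# Hypothetical group-level log2 expression values for a single gene.
group.log2 <- c(conditionnorm = 5.0, conditiontumo = 3.5)

# The contrast turns the two group values into a tumor-versus-normal
# log-fold change; a negative value means lower expression in tumor.
logFC <- sum(contrast.vec * group.log2)
print(logFC)
```

With the contrast in this orientation, the sign of logFC is read relative to the normal group.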
The “qlfq” object is a DGELRT object that stores the results of a GLM-based differential
expression analysis for DGE data. The “topTags” function returns the ten (n = 10) most
significantly differentially expressed genes, as shown in Figure 5.22. Setting
“p.value=0.05” applies the cutoff to the adjusted p-values, so only genes with a BH-adjusted
p-value (FDR) of at most 0.05 are listed. Negative log-fold changes (logFC, on the log2
scale) represent genes that are downregulated (down-expressed) in tumor samples relative to
normal samples, and positive values represent upregulated genes; logCPM is the average log2
counts-per-million; F is the quasi-likelihood F-statistic for the null hypothesis that there
is no difference in gene expression between the normal and tumor samples; PValue is the
unadjusted significance measure (by convention, p-value < 0.05 is considered significant);
and FDR is the false discovery rate, that is, the p-value adjusted for multiple testing with
the Benjamini-Hochberg (BH) method.
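The relationship between the PValue and FDR columns can be illustrated with base R alone. The sketch below uses a small, entirely hypothetical results table (gene names and values are made up) and applies the Benjamini-Hochberg adjustment with “p.adjust”, then filters on the adjusted values and labels the direction of change from the sign of logFC.

```r
# Hypothetical results table mimicking columns reported for each gene.
res <- data.frame(
  gene   = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  logFC  = c(-2.1, 1.8, 0.3, -0.9, 2.5),
  PValue = c(0.0004, 0.003, 0.40, 0.045, 0.011)
)

# Benjamini-Hochberg adjustment of the raw p-values.
res$FDR <- p.adjust(res$PValue, method = "BH")

# Keep only genes whose adjusted p-value is at most 0.05.
sig <- res[res$FDR <= 0.05, ]

# Negative logFC: lower in tumor; positive logFC: higher in tumor.
sig$direction <- ifelse(sig$logFC < 0, "down in tumor", "up in tumor")

# Sort by raw p-value, most significant first.
print(sig[order(sig$PValue), ])
```

In this toy table, geneD has a raw p-value below 0.05 but a BH-adjusted value of about 0.056, so it is dropped by the filter; this is why a cutoff on adjusted p-values is stricter than the same cutoff on raw p-values.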
To use the GLM negative binomial model instead of the quasi-likelihood negative binomial
model for the differential expression analysis, you can use the following script:
FIGURE 5.22 The top ten significantly expressed genes.